Characterizing Entanglement Sources 
We discuss how to characterize entanglement sources with finite sets of measurements. The measurements do
not have to be tomographically complete, and may consist of POVMs rather than von Neumann measurements.
Our method yields a probability that the source generates an entangled state as well as estimates of any desired
calculable entanglement measures, including their error bars. We apply two criteria, namely Akaike's information
criterion and the Bayesian information criterion, to compare and assess different models (with different
numbers of parameters) describing entanglement-generating devices. We discuss differences between standard
entanglement-verification methods and our present method of characterizing an entanglement source.
I. INTRODUCTION

Entanglement is useful, but hard to generate, and even
harder to detect. The most measurement-intensive approach to
the problem of experimentally detecting the presence of
entanglement is to perform complete quantum-state tomography
[1]. Even for just two qubits this implies a reconstruction
of all 15 independent elements of the corresponding density
matrix. Subsequently applying the positive partial transpose
(PPT) criterion to the reconstructed matrix gives a conclusive
answer about entanglement or separability of the state [2, 3].

From the practical point of view it is desirable to have an 
entanglement detection tool that is more economical than full 
state tomography but nevertheless is decisive. Already in the 
original work on PPT [2] it was noticed that one can always 
construct an observable W with non-negative expectation values
for all separable states $\rho_s$ and a negative expectation value
for at least one entangled state $\rho_e$. In this way an
experimentally detected violation of the inequality
$\langle W \rangle \geq 0$ is a sufficient condition for
entanglement. The observable W is called an entanglement witness
(EW). There always exists an optimal choice of local orthogonal
observables such that a given EW can be expressed as a sum of
their direct products [4], so
that a witness can always be measured locally. The advantage
of using EWs for entanglement detection will be appreciated 
better for multi-partite systems with more than two qubits, be- 
cause the number of tomographic measurements would grow 
exponentially with the number of qubits. On the other hand, 
a given witness does not detect all entangled states and there- 
fore a variety of different EWs should be tested in order to 
rule out false negative results. 

EWs assume the validity of quantum mechanics, and also 
assume one knows what measurements one is actually performing.
A valuable alternative to EWs can be sought in using
a violation of Bell-CHSH inequalities [5, 6] as a sufficient
condition for entanglement (although a Bell-inequality test
can be formulated as a witness, too [7]). Because Bell
inequalities are derived from classical probability theory without
any reference to quantum mechanics, no assumption about what is
being measured is necessary. This method is, therefore, safe in
the sense of avoiding many pitfalls arising from unwarranted
(hidden) assumptions about one's experiment [8].

Here we propose a different method for characterizing an 
entanglement source that automatically takes into account finite
data as well as imperfect measurements. Our method consists
of two parts. The first part, "Bayesian updating," produces
an estimate of the relative probabilities that entangled and sep- 
arable states are consistent with a given finite set of data. This 
estimate depends on what a priori probability distribution (the 
prior) one chooses over all possible states (the more data one 
has, the less it depends on the prior). That is, there is an a 
priori probability of entanglement, and each single measure- 
ment updates this probability to an a posteriori probability 
of entanglement. The latter then has to be compared to the 
former, in order to reach the conclusion that one is now ei- 
ther more certain or less certain about having produced an en- 
tangled state. In fact, every experiment can only make such 
probabilistic statements about entanglement, although this is 
almost never explicitly stated in these terms. Thus our method
differs from those in Refs. [9-11], which assume expectation
values of EWs are known (corresponding effectively to an
infinite data set) and try to find the minimally entangled state
consistent with those expectation values. We use a numerical
Bayesian updating method for a probability distribution over
density matrices, which is similar to that recently discussed in
Ref. [12] in the context of quantum-state tomography. In
particular, whereas the reconstruction of a density matrix from
experimental data is usually based on maximum likelihood
estimation (MLE), Ref. [12] discusses its drawbacks and
proposes Bayesian updating in its stead as a superior method. Our
aim, though, is not to give an estimate of the density matrix, 
but of entanglement. In fact, any quantity that can be calcu- 
lated from a density matrix, such as the purity of one's state, 
can be estimated this way. 

The second part of our method introduces two information 
criteria [13] to judge how different models of a given entangle- 
ment generation process can be compared to each other quan- 
titatively. It is probably best to explain this part by giving an 
example. For simplicity, we consider the case of two-qubit 
states. Suppose an experimentalist has a model for her entan- 
glement generating source that contains, say, two parameters 
describing two physically different sources of noise in the
final two-qubit state produced. She may try to fit her data to her
two-parameter model, but obviously there are always states in 
the full 15-dimensional set of all physical states that will fit the 
data better. There are a number of criteria, standard in the lit- 
erature on statistical models, that compare quantitatively how 
different models fit the data. Here we will use Akaike's
Information Criterion (AIC) and the Bayesian Information
Criterion (BIC) [13]. These information criteria aim to find the
most informative model, not the best-fitting model. The idea
is that a two-parameter model fitting the data almost as well as
the full quantum-mechanical description would provide more
physical insight and a more economical (think Occam's razor)
and transparent description. Each of the two information
criteria, AIC and BIC, produces a number $\Omega$. One term in
$\Omega$ is the logarithm of the maximum likelihood possible within
each model, and the second term subtracts a penalty for each
parameter used in the model. The model with the larger value of
$\Omega$ is then deemed to be the more informative. We propose here
to combine information criteria with the Bayesian updating 
methods for entanglement estimation. Namely, we propose 
to use the more informative model to generate a "substitute 
prior." In the case that the simpler model is the more informa- 
tive, the numerical efforts required for our Bayesian updating 
method are much smaller, and yet should lead to correct de- 
scriptions of the entanglement generated by one's source. 

This paper is organized as follows. In Section II we give a
general formulation of our method of Bayesian updating
applicable to any quantum system. We also formulate precisely
the two information criteria for model selection. In Section III
we discuss numerical examples, which illustrate the Bayesian
methods and the information criteria. For concreteness we
consider measurements of Bell-CHSH correlations (although
any sort of measurements would do). The examples show that
our method detects entangled two-qubit states that escape
detection by any of the Bell-CHSH inequalities and even by
violations of the stronger version of these bounds, which we call
Roy-Uffink-Seevinck bounds [14-16]. In the Discussion
and Conclusions Section we discuss the essential difference
between our method of characterizing an entanglement source
and the standard methods of entanglement verification [8, 17].



II. QUANTIFYING ENTANGLEMENT VIA BAYESIAN 
UPDATING 

Here we present a numerical Bayesian updating method for 
one's probability distribution over density matrices. A related
Bayesian method was recently advocated in Ref. [12] in the
context of quantum-state tomography and quantum-state
estimation. We note our aim is not to give an estimate of the
density matrix, but, more modestly, to give estimates of en- 
tanglement, purity, and in principle any quantity that can be 
efficiently calculated from the density matrix. We first discuss 
the method in general, and subsequently we propose a new 
method to choose a prior probability distribution over density 
matrices. 



A. Method 

The method itself can be formulated as a five-step proce- 
dure: 

1. For a system of M qubits we first choose a finite test set
of density matrices. We calculate the amount of entanglement
(in fact, the negativity) for each state in the set [23]. The
a priori probability that our unknown experimentally
generated state, which we denote by $\rho_?$, equals
a state $\rho$ in the set is chosen as
$p_{\rm prior}(\rho) = 1/N_s$, where $N_s$ is the number of
states in the set.

2. We assume some set of POVMs with elements $\{\Pi_i\}$ is
measured. These POVMs can describe any (noisy) set
of measurements one performs on the qubits.

3. For the acquired measurement record $d = \{d_1, \ldots, d_i\}$,
consisting of the number of times outcome i was obtained [24],
we calculate the quantum-mechanical probability $p(d|\rho)$
that a given state $\rho$ from the test set generates the
measurement outcome d (which follows directly from
${\rm Tr}\,\rho\Pi_i$). Having at hand the probabilities
$p(d|\rho)$ for all states $\rho$ in the test set, we are now
able to calculate the a priori probability
$p(d) = \sum_{\rho} p(d|\rho)/N_s$ for
the measurement record d to occur.

4. We calculate, using Bayes' rule, the probability
$p(\rho|d)$ of having the state $\rho$ given the measurement
outcomes d: $p(\rho|d) = p(d|\rho)/[N_s\, p(d)]$.

5. We obtain the posterior probability distribution over
density matrices in our test set:
$p_{\rm posterior}(\rho) := p(\rho|d)$ for all states $\rho$.

We can then repeat steps 2-5 for a new set of measurements d,
if needed.
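The five steps above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the code used for the results in this paper: the function name `bayes_update`, the array layout, and the single-qubit toy test set are all hypothetical, and the likelihood omits the multinomial factor because it is the same for every test state.

```python
import numpy as np

def bayes_update(prior, states, povm, counts):
    """One Bayesian update of the probabilities over a finite test set.

    prior  : (N_s,) probabilities of the test states before the update
    states : (N_s, D, D) density matrices of the test set
    povm   : (n_out, D, D) POVM elements Pi_i
    counts : (n_out,) number of times outcome i was observed
    """
    # Born rule: probs[s, i] = Tr(rho_s Pi_i)
    probs = np.real(np.einsum("sab,iba->si", states, povm))
    probs = np.clip(probs, 1e-300, None)       # guard against log(0)
    # log p(d|rho); the multinomial factor is common to all states and drops out
    log_like = np.log(probs) @ np.asarray(counts, dtype=float)
    with np.errstate(divide="ignore"):
        log_post = np.log(prior) + log_like
    log_post -= log_post.max()                 # avoid underflow in exp
    post = np.exp(log_post)
    return post / post.sum()                   # the normalisation plays the role of p(d)


# Toy single-qubit test set (hypothetical): three diagonal states
states = np.array([np.diag([q, 1 - q]) for q in (0.9, 0.5, 0.1)])
povm = np.array([np.diag([1.0, 0.0]), np.diag([0.0, 1.0])])   # projective Z measurement
prior = np.full(3, 1 / 3)                                      # p_prior = 1/N_s
posterior = bayes_update(prior, states, povm, counts=[9, 1])
```

Working with logarithms keeps the update stable when thousands of counts are involved. Feeding the returned posterior back in as the prior is one way to process a further batch of data.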

This procedure gives us, in step 5, a numerical estimate of
the a posteriori probability that the unknown state $\rho_?$ equals
the state $\rho$ from the test set. From $p(\rho|d)$ we can estimate the
probability $p_e$ for the state $\rho_?$ to be entangled. We just sum
the probabilities $p(\rho|d)$ for all entangled states $\rho_{\rm ent}$
in the set, i.e.,

$$p_e = \sum_{\rho = \rho_{\rm ent}} p(\rho|d). \tag{1}$$

Furthermore, we can calculate probability distributions for
any function of the density matrix, such as the negativity and
purity. We thus infer expectation values such as

$$\bar{N} = \sum_{\rho = \rho_{\rm ent}} p(\rho|d)\, N(\rho), \tag{2}$$

and

$$\bar{P} = \sum_{\rho} p(\rho|d)\, {\rm Tr}(\rho^2), \tag{3}$$

as well as standard deviations $\sigma_N = \sqrt{\overline{N^2} - \bar{N}^2}$, etc.
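As a small illustration of Eqs. (1)-(3), the sketch below turns a posterior over a test set into a probability of entanglement and an estimate with an error bar. The four-state posterior and the negativity values are made-up numbers, and `posterior_summary` is a hypothetical helper name.

```python
import numpy as np

def posterior_summary(post, values):
    """Posterior mean and standard deviation of a quantity defined on the test set."""
    mean = np.sum(post * values)
    var = np.sum(post * values**2) - mean**2
    return mean, np.sqrt(max(var, 0.0))    # clip tiny negative rounding errors

# Hypothetical posterior over four test states and their negativities
post = np.array([0.1, 0.2, 0.3, 0.4])
negativity = np.array([0.0, 0.0, 0.2, 0.4])

p_ent = post[negativity > 0].sum()                      # Eq. (1)
N_mean, N_std = posterior_summary(post, negativity)     # Eq. (2) and sigma_N
```

Summing over all states instead of only the entangled ones gives the same mean negativity, because separable states contribute zero.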
The meaning of our final probability distribution $p(\rho|d)$ and
of the above expectation values is as follows. If we were
forced to give a single density matrix that best describes all
data and that includes error bars, we would give the mixed
state $\bar{\rho} = \int d\rho\, p(\rho|d)\,\rho$, as explained in [12].
The purity and negativity of the state $\bar{\rho}$ are not equal to
(in fact, smaller than) the estimates $\bar{N}$ and $\bar{P}$ that we
use here. The difference is this: if one were to perform more
measurements that are tomographically complete, $\bar{N}$ is the
expected negativity of the final estimated density matrix.
$N(\bar{\rho})$, on the other hand, would be the useful entanglement
of a single copy available without performing more measurements.
For most quantum information processing purposes (such as
teleportation) one indeed needs more precise knowledge about the
density matrix than just its entanglement. See Ref. [8] for more
discussions on this issue.



B. Model testing and information criteria 

The only problem standing in the way of a straightforward 
application of the above Bayesian updating procedure is that a 
sufficiently dense test set (used in step 1) is in general too hard 
to handle numerically, since even for two-qubit density matri- 
ces the parameter space is 15-dimensional. Although there 
are certainly ways out of this problem (in particular, sampling 
directly from the posterior probability distribution can be
efficiently done with the Metropolis-Hastings algorithm, see e.g.
[10]), here we stick to the idea of a set of test states by
simplifying that set, as follows.

As an illustrative example (which we will again consider in
great detail in the next Section), consider an experimentalist
trying to produce a maximally entangled Bell state of 2 qubits,
say, $(|00\rangle + |11\rangle)/\sqrt{2}$. She wants to test her
entanglement-generating device by measuring some set of Bell
correlations. In particular, suppose she measures $2 < K < 15$
independent observables.

From her previous experience with the same device, she
models the generation process by assuming there is both
Gaussian phase noise and white noise (mixing with the
maximally mixed state $\propto \mathbb{1}$). That is, she assumes her
device generates states of the form

$$\rho_{p,\sigma} = p\,\rho_\sigma + (1-p)\,\frac{\mathbb{1}}{4}, \tag{4}$$

with $p \in [0, 1]$ and

$$\rho_\sigma = \frac{1}{2}\int_{-\pi}^{\pi} d\phi\, P(\phi)\,\big(|00\rangle + e^{i\phi}|11\rangle\big)\big(\langle 00| + e^{-i\phi}\langle 11|\big). \tag{5}$$

Here $P(\phi)$ is a Gaussian phase distribution of the form

$$P(\phi) = N_\sigma \exp(-\phi^2/\sigma^2), \tag{6}$$

with the normalization factor $N_\sigma$ given by

$$N_\sigma = \frac{1}{\int_{-\pi}^{\pi} d\phi\, \exp(-\phi^2/\sigma^2)}. \tag{7}$$

So there are just two parameters the experimentalist has to
determine from her measurement results, p and $\sigma$.
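A minimal numerical sketch of the two-parameter family follows, under some assumptions. The phase integral is restricted to $[-\pi, \pi]$ and evaluated on a uniform grid. The basis order is |00>, |01>, |10>, |11>. The negativity is taken as twice the absolute sum of the negative eigenvalues of the partial transpose, a normalisation assumed here because it reproduces the value 0.0843 quoted for $\rho_{0.4,0.4}$ in the caption of Fig. 2. All function names are hypothetical.

```python
import numpy as np

def dephasing_factor(sigma, n=20001):
    """<exp(i phi)> for P(phi) ~ exp(-phi^2/sigma^2) restricted to [-pi, pi]."""
    if sigma == 0:
        return 1.0                                   # no phase noise
    phi = np.linspace(-np.pi, np.pi, n)
    w = np.exp(-phi**2 / sigma**2)                   # unnormalised P(phi)
    return np.sum(w * np.cos(phi)) / np.sum(w)       # sine part vanishes by symmetry

def rho_p_sigma(p, sigma):
    """Phase-averaged Bell state mixed with white noise."""
    c = dephasing_factor(sigma)
    rho_sigma = np.zeros((4, 4))
    rho_sigma[0, 0] = rho_sigma[3, 3] = 0.5
    rho_sigma[0, 3] = rho_sigma[3, 0] = 0.5 * c      # coherence reduced by the phase noise
    return p * rho_sigma + (1 - p) * np.eye(4) / 4

def negativity(rho):
    """Twice the absolute sum of negative eigenvalues of the partial transpose."""
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    ev = np.linalg.eigvalsh(pt)
    return 2 * np.abs(ev[ev < 0]).sum()

def purity(rho):
    return float(np.real(np.trace(rho @ rho)))

rho_example = rho_p_sigma(0.4, 0.4)
N_example, P_example = negativity(rho_example), purity(rho_example)
```

Only the average of $\cos\phi$ enters, because the Gaussian weight is symmetric in $\phi$; the phase noise therefore just rescales the coherence between |00> and |11>.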

As a measure to judge how well her data d fit the model
(4)-(7), she considers the best likelihood for that model,

$$L_{p,\sigma} = \max_{p,\sigma} P(d|\rho_{p,\sigma}). \tag{8}$$

She would like to compare this to the maximum likelihood
over all physical two-qubit states $\rho$,

$$L_a = \max_{\rho} P(d|\rho). \tag{9}$$

There are several ways to compare these two quantities [13].
One criterion is called Akaike's Information Criterion, and it
defines the quantity

$$\Omega = \log(L) - k \tag{10}$$

for each model, where k is the number of parameters in the
model, and L is the maximum likelihood for the model. The
quantity $\Omega$ rewards a high value of the best likelihood
(indicating a good fit), but penalizes a large number of parameters
(to guard against overfitting). Now when measuring $2 < K < 15$
observables, the best complete model contains just K, not 15,
independent parameters.

Thus the experimentalist would calculate two numbers

$$\Omega_{p,\sigma} = \log(L_{p,\sigma}) - 2, \qquad \Omega_a = \log(L_a) - K. \tag{11}$$

If $\Omega_{p,\sigma} > \Omega_a$ then the Akaike Information Criterion judges
the simple 2-parameter model to be more informative than the
complete K-parameter model.

There is a Bayesian version of this criterion [13], and it is
defined in terms of similar quantities

$$\Omega' = \log(L) - k\log(N_m)/2, \tag{12}$$

where L and k have the same meaning as before, and $N_m$
is the number of data taken. Again, if $\Omega'_{p,\sigma} > \Omega'_a$, the
2-parameter model is considered more informative than the
K-parameter description. For $N_m \geq 8$ (so that $\log(N_m)/2 > 1$)
the BIC puts a larger penalty on the number of parameters than
does the AIC.
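The two criteria are simple to evaluate once the maximum log-likelihoods are known. The sketch below follows Eqs. (10) and (12); the log-likelihood values are invented purely for illustration, and the function names are hypothetical.

```python
import math

def aic_score(log_like_max, k):
    """Omega = log(L) - k, Eq. (10)."""
    return log_like_max - k

def bic_score(log_like_max, k, n_data):
    """Omega' = log(L) - k log(N_m)/2, Eq. (12)."""
    return log_like_max - k * math.log(n_data) / 2

# Hypothetical numbers: a 2-parameter model fitting slightly worse than the
# complete K = 11 parameter description, with N_m = 5000 data points.
n_m = 5000
omega_simple = aic_score(-6905.0, 2)
omega_full = aic_score(-6900.0, 11)
omega_p_simple = bic_score(-6905.0, 2, n_m)
omega_p_full = bic_score(-6900.0, 11, n_m)

simple_more_informative_aic = omega_simple > omega_full
simple_more_informative_bic = omega_p_simple > omega_p_full
```

With these made-up numbers the simple model wins under both criteria: its deficit of 5 in log-likelihood is smaller than the 9 extra parameters penalised by the AIC, and smaller still than the BIC penalty, which is about 4.26 per parameter at $N_m = 5000$.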

In the case that the simple model turns out to be more
informative, according to at least one of the two criteria (this
depends on the data), we propose that the experimentalist may
well use the simple model to construct a test set of states. For
example, she could assume as prior probability distributions
for p and $\sigma$ that p is uniformly distributed on the interval [0, 1],
and that $\sigma$ is uniform on, say, the interval $[0, \pi]$ (this is
somewhat arbitrary, of course, as every prior is). Then, the test set
of states could be sampled by simply choosing $N_p$ uniformly
spaced points in the interval [0, 1] for p and $N_\sigma$ uniformly
spaced points in the interval $[0, \pi]$ for $\sigma$, thus creating a test
set of $N_s = N_p \cdot N_\sigma$ states.
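The construction of such a grid-based test set can be sketched as below. Several choices here are assumptions rather than the procedure used for the figures: the grid points are cell midpoints (which avoids $\sigma = 0$), the phase integral runs over $[-\pi, \pi]$, and the negativity uses a closed form, $\max(0,\, p\,c - (1-p)/2)$ with $c = \langle\cos\phi\rangle$, derived here from the partial transpose of the two-parameter states and normalised to 1 for a Bell state.

```python
import numpy as np

def dephasing_factor(sigma, phi):
    """<cos(phi)> under the Gaussian phase distribution restricted to the grid phi."""
    w = np.exp(-(phi / sigma) ** 2)
    return np.sum(w * np.cos(phi)) / np.sum(w)

n_p, n_sigma = 200, 200
# Midpoints of equal-width cells: uniformly spaced, and sigma = 0 is never hit
p_grid = (np.arange(n_p) + 0.5) / n_p
sigma_grid = (np.arange(n_sigma) + 0.5) * np.pi / n_sigma
phi = np.linspace(-np.pi, np.pi, 4001)
c = np.array([dephasing_factor(s, phi) for s in sigma_grid])

P, C = np.meshgrid(p_grid, c, indexing="ij")      # all N_p * N_sigma test states
neg = np.maximum(0.0, P * C - (1 - P) / 2)        # closed-form negativity (assumed)
prior = np.full(neg.shape, 1.0 / neg.size)        # uniform prior 1/N_s
p_ent_prior = prior[neg > 0].sum()                # prior probability of entanglement
```

Because every test state is labelled by just two numbers, the whole set fits in two small arrays; the prior weight on entangled states comes out close to one half for this choice of intervals.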

The above model leads to states that are diagonal in the Bell
basis,

$$\begin{aligned}
|\Phi_1\rangle &= (|00\rangle + |11\rangle)/\sqrt{2}, \\
|\Phi_2\rangle &= (|00\rangle - |11\rangle)/\sqrt{2}, \\
|\Phi_3\rangle &= (|01\rangle + |10\rangle)/\sqrt{2}, \\
|\Phi_4\rangle &= (|01\rangle - |10\rangle)/\sqrt{2}.
\end{aligned} \tag{13}$$

In the next Section we thus consider not only the above
two-parameter model, but also its obvious extension to a
three-parameter model by allowing all Bell-diagonal states.



III. EXAMPLES

A. Orthogonal spin measurements 

In the following we consider, as an example, two-qubit
states with spin measurements performed on each qubit
(considered as a spin-1/2 system) in two arbitrary spatial directions
that are orthogonal, and we denote the corresponding spin
operators by $A_1$ and $A_2$ for the first qubit, and $B_1$ and $B_2$ for the
second qubit. This constitutes a measurement of 8 independent
quantities, four single-qubit expectation values and four
correlations.

We note that in this case we can construct four Bell-CHSH
operators from the four measured correlations:

$$\begin{aligned}
\mathcal{B}_1 &:= A_1 \otimes (B_1 + B_2) + A_2 \otimes (B_1 - B_2), \\
\mathcal{B}_2 &= A_1 \otimes (B_1 + B_2) - A_2 \otimes (B_1 - B_2), \\
\mathcal{B}_3 &= A_1 \otimes (B_1 - B_2) + A_2 \otimes (B_1 + B_2), \\
\mathcal{B}_4 &= A_1 \otimes (-B_1 + B_2) + A_2 \otimes (B_1 + B_2).
\end{aligned} \tag{14}$$

We then test two-qubit states that may be entangled but that 
do not violate any of the four Bell inequalities that can be 
constructed from these four operators. In fact, we will not 
even optimize the choice of spatial directions, given an initial 
guess of what state should be produced, for violating a Bell 
inequality. 
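To make this concrete, the sketch below builds the four Bell-CHSH operators for the illustrative choice $A_1 = B_1 = X$, $A_2 = B_2 = Y$ and evaluates them on a Bell state mixed with white noise. The state and its mixing weight of 0.4 are chosen here only as an example. A negative eigenvalue of the partial transpose is used as the entanglement test.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])

def chsh_operators(A1, A2, B1, B2):
    """The four Bell-CHSH operators built from two settings per qubit."""
    k = np.kron
    return [
        k(A1, B1 + B2) + k(A2, B1 - B2),
        k(A1, B1 + B2) - k(A2, B1 - B2),
        k(A1, B1 - B2) + k(A2, B1 + B2),
        k(A1, -B1 + B2) + k(A2, B1 + B2),
    ]

# Illustrative state: 0.4 |Phi_1><Phi_1| + 0.6 * identity/4
bell = np.zeros((4, 4), dtype=complex)
bell[0, 0] = bell[3, 3] = bell[0, 3] = bell[3, 0] = 0.5
rho = 0.4 * bell + 0.6 * np.eye(4) / 4

chsh_values = [float(np.real(np.trace(rho @ B))) for B in chsh_operators(X, Y, X, Y)]

# Partial transpose on the second qubit; a negative eigenvalue signals entanglement
pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
min_pt_eigenvalue = np.linalg.eigvalsh(pt).min()
```

For this state none of the four expectation values comes anywhere near the classical bound of 2 in magnitude, yet the partial transpose has a negative eigenvalue, so the state is entangled without violating any of these Bell inequalities.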

Finally, we will add one more correlation to be measured,
namely that involving the third dimension: $A_3 B_3$. That is,
whenever $A_3$ is measured on the first qubit, $B_3$ is measured
on the second qubit. This addition makes the measurements
on each qubit separately tomographically complete, but it does
not lead to additional Bell-CHSH operators. The total number
of independent observables measured in this case is 11 (four
are missing). Thus, the parameter K to be used for evaluating
$\Omega_a$ of Eq. (11) is K = 11.



B. Analytical results 

Determining the AIC and BIC criteria can be done
analytically in most cases that we will consider here. First of
all, we can bound the maximum likelihood over all states,
given the sort of measurements from the preceding subsection.
There are 20 observed frequencies, as follows: for
each of the five correlation measurements $A_i B_j$, where $(i, j)$
takes on the values (1, 1), (1, 2), (2, 1), (2, 2), (3, 3),
there are four different outcomes, which we can denote by
(+, +), (+, -), (-, +), (-, -). If we denote these frequencies
by $f_{ijk}$, for k = 1 ... 4, then the (log of the) maximum
likelihood is bounded by

$$\log(L_a) \leq \sum_{k,(ij)} N_{ij}\, f_{ijk} \log(f_{ijk}), \tag{15}$$

where $N_{ij}$ is the number of times the $A_i B_j$ correlation was
measured. If we assume all five correlations are measured
equally often, then we have

$$\log(L_a) \leq \frac{N_m}{5} \sum_{k,(ij)} f_{ijk} \log(f_{ijk}). \tag{16}$$

The bound is achieved when there is a physical state predicting
the frequencies exactly as they were observed.
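The bound can be checked numerically with a few lines. In this sketch the observed frequencies are invented for illustration, the helper names are hypothetical, and the likelihood is understood up to the state-independent multinomial factor.

```python
import numpy as np

def xlogy(f, q):
    """Elementwise f * log(q), with the convention 0 * log(.) = 0."""
    f = np.asarray(f, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.where(f > 0, f * np.log(np.where(f > 0, q, 1.0)), 0.0)

def log_likelihood(pred, freqs, n_per_setting):
    """sum_(ij) N_ij sum_k f_ijk log(pred_ijk) for predicted probabilities pred."""
    per_setting = xlogy(freqs, pred).sum(axis=1)
    return float(np.dot(n_per_setting, per_setting))

def log_likelihood_bound(freqs, n_per_setting):
    """Upper bound: the prediction coincides with the observed frequencies."""
    return log_likelihood(freqs, freqs, n_per_setting)

# Invented frequencies for the settings XX, XY, YX, YY, ZZ (rows sum to 1)
freqs = np.array([
    [0.35, 0.15, 0.15, 0.35],
    [0.25, 0.25, 0.25, 0.25],
    [0.25, 0.25, 0.25, 0.25],
    [0.15, 0.35, 0.35, 0.15],
    [0.40, 0.10, 0.10, 0.40],
])
n_runs = [1000] * 5
bound = log_likelihood_bound(freqs, n_runs)
ll_white_noise = log_likelihood(np.full((5, 4), 0.25), freqs, n_runs)
```

Any set of predicted probabilities gives a log-likelihood at or below `bound`; the maximally mixed state, which predicts 1/4 everywhere, is one example.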

Let us choose directions of our spin measurements as
$A_1 = B_1 = X$, $A_2 = B_2 = Y$ and $A_3 = B_3 = Z$. Then, there are
two obvious models an experimentalist could choose from:
the first is the Bell-diagonal model, containing three
parameters, in which states are of the form

$$\rho = \sum_{i=1}^{4} p_i\, |\Phi_i\rangle\langle\Phi_i|. \tag{17}$$

The observed frequencies $f_{ijk}$ for the five correlations cannot
all be predicted to arbitrary accuracy by Bell-diagonal states.
In fact, most frequencies predicted by this model are
independent of the values of $\{p_i\}$, and are equal to 1/4. The only
predicted frequencies (which we denote by $\tilde{f}$ so as to distinguish
them from the observed frequencies f) that actually depend
on the values of $\{p_i\}$ are

$$\begin{aligned}
\tilde{f}_{111} &= \tilde{f}_{114} = p_1/2 + p_3/2, \\
\tilde{f}_{221} &= \tilde{f}_{224} = p_2/2 + p_3/2, \\
\tilde{f}_{331} &= \tilde{f}_{334} = p_1/2 + p_2/2, \\
\tilde{f}_{112} &= \tilde{f}_{113} = p_2/2 + p_4/2, \\
\tilde{f}_{222} &= \tilde{f}_{223} = p_1/2 + p_4/2, \\
\tilde{f}_{332} &= \tilde{f}_{333} = p_3/2 + p_4/2.
\end{aligned} \tag{18}$$

The best-fitting Bell-diagonal state can only predict the
correct correlations between XX, YY, and ZZ measurements.
For example, there is a Bell-diagonal state predicting the
correct value for the sum $f_{111} + f_{114}$, but its prediction for the
difference will always be zero. Thus, the Bell-diagonal state
fitting the data best will have the following values for $\{p_i\}$:

$$\begin{aligned}
p_1 &= [f_{111} + f_{114} - f_{221} - f_{224} + f_{331} + f_{334}]/2, \\
p_2 &= [-f_{111} - f_{114} + f_{221} + f_{224} + f_{331} + f_{334}]/2, \\
p_3 &= [f_{111} + f_{114} + f_{221} + f_{224} - f_{331} - f_{334}]/2,
\end{aligned} \tag{19}$$

provided the observed frequencies are such that the $\{p_i\}$,
including $p_4 = 1 - p_1 - p_2 - p_3$, are all nonnegative. In that case,
the (log of the) maximum likelihood over all Bell-diagonal
states is



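$$\log L_{Bd} = \frac{N_m}{5}\Big[ \sum_i (f_{ii1} + f_{ii4}) \log([f_{ii1} + f_{ii4}]/2)
+ \sum_i (f_{ii2} + f_{ii3}) \log([f_{ii2} + f_{ii3}]/2)
+ \sum_{k,\, i \neq j} f_{ijk} \log(1/4) \Big]. \tag{20}$$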
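The Bell-diagonal fit can be sketched as follows. It assumes, consistent with the predicted frequencies of Eq. (18), that the weights are fixed by the equal-outcome sums $f_{ii1} + f_{ii4}$ of the XX, YY and ZZ settings, and that every correlation is measured $N_m/5$ times. The function names and the example frequencies are hypothetical.

```python
import numpy as np

def bell_diagonal_fit(f_xx, f_yy, f_zz):
    """Best-fitting Bell-diagonal weights (p_1, ..., p_4).

    Each argument holds the relative frequencies of the outcomes
    (+,+), (+,-), (-,+), (-,-), in that order, for one setting.
    """
    same_x = f_xx[0] + f_xx[3]          # f_111 + f_114
    same_y = f_yy[0] + f_yy[3]          # f_221 + f_224
    same_z = f_zz[0] + f_zz[3]          # f_331 + f_334
    p1 = (same_x - same_y + same_z) / 2
    p2 = (-same_x + same_y + same_z) / 2
    p3 = (same_x + same_y - same_z) / 2
    p = np.array([p1, p2, p3, 1 - p1 - p2 - p3])
    if np.any(p < -1e-12):
        raise ValueError("no physical Bell-diagonal state matches these sums")
    return p

def log_l_bell_diagonal(f_xx, f_yy, f_zz, n_m):
    """Maximum log-likelihood over Bell-diagonal states, assuming all sums are > 0."""
    total = 2 * np.log(0.25)            # XY and YX settings: predicted 1/4 each
    for f in (f_xx, f_yy, f_zz):
        same, diff = f[0] + f[3], f[1] + f[2]
        total += same * np.log(same / 2) + diff * np.log(diff / 2)
    return n_m / 5 * total

# Frequencies generated exactly by the weights (0.6, 0.2, 0.1, 0.1)
f_xx = [0.35, 0.15, 0.15, 0.35]
f_yy = [0.15, 0.35, 0.35, 0.15]
f_zz = [0.40, 0.10, 0.10, 0.40]
p_fit = bell_diagonal_fit(f_xx, f_yy, f_zz)
log_l_bd = log_l_bell_diagonal(f_xx, f_yy, f_zz, n_m=5000)
```

The fit only uses the sums of equal-outcome frequencies, which is why a Bell-diagonal state can never reproduce an observed difference such as $f_{111} - f_{114}$.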
For the Bell-diagonal model we can construct a prior
distribution over Bell-diagonal states by choosing the numbers $\{p_i\}$
uniformly over the simplex, as explained in [20].

The two-parameter model is similar in its predictions to
the Bell-diagonal model. The only difference is that the
parameters $p_3$ and $p_4$ are equal. Thus this model can predict
only two correlations correctly, namely ZZ and XX - YY.
The maximum likelihood for this model, then, is given by

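$$\begin{aligned}
\log L_{p,\sigma} = \frac{N_m}{5}\Big[ & (f_{331} + f_{334}) \log([f_{331} + f_{334}]/2) \\
& + (f_{332} + f_{333}) \log([f_{332} + f_{333}]/2) \\
& + (f_{111} + f_{114} + f_{222} + f_{223}) \times \log(1/4 + (f_{111} + f_{114} - f_{221} - f_{224})/4) \\
& + (f_{221} + f_{224} + f_{112} + f_{113}) \times \log(1/4 + (f_{221} + f_{224} - f_{111} - f_{114})/4) \\
& + \sum_{k,\, i \neq j} f_{ijk} \log(1/4) \Big],
\end{aligned} \tag{21}$$

provided all inferred frequencies are nonnegative.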

The three quantities to be used for selecting the most
informative model are then, in the case of AIC:

$$\begin{aligned}
\Omega_a &= \log L_a - 11, \\
\Omega_{p,\sigma} &= \log L_{p,\sigma} - 2, \\
\Omega_{Bd} &= \log L_{Bd} - 3,
\end{aligned} \tag{22}$$

and similar expressions for the BIC.

C. Numerics 

Let us first discuss the two-parameter substitute prior with
p and $\sigma$ drawn uniformly from [0, 1] and $[0, \pi]$, respectively.
As our favorite entanglement monotone we use the negativity
[19, 20]. The prior probability distribution for negativity is
displayed in Figure 1 (the graph for concurrence is the same
for this special case). The plot shows that states exist in the
full range of separable to maximally entangled, with the prior
probability of entanglement being $P_{\rm ent} = 50.3\%$.

Using this prior, we consider the measurement of five
different Bell correlations. Sample results are displayed and
discussed in Figures 2-4. Figure 2 shows measurement results
generated from an entangled state $\rho_? = \rho_{0.4,0.4}$, as defined in
Eq. (4). There is no need to test either AIC or BIC for this
case, since the state is chosen from the two-parameter set of
states, so the two-parameter model is trivially more
informative. The Bayesian posterior probability distribution for
entanglement is consistent with the actual entanglement properties
of $\rho_?$, as discussed in the Figure caption.

We then also test a state that is just separable, the state
$\rho_{1/3,1/3}$. The results can be summarized as "inconclusive"
about the question whether the data inform the experimentalist
that the underlying state is entangled or not. This is not
surprising given how close the actual state is to the
separable/entangled boundary. The plot for purity is not shown, as
it is very similar to Figure 3 (the estimate of the purity is
$\bar{P} = 0.331 \pm 0.013$, perfectly consistent with the actual purity
of 0.3303 of $\rho_{1/3,1/3}$).

FIG. 1: Prior probability distribution of the negativity, for the
two-parameter states $\rho_{p,\sigma}$ with p and $\sigma$ drawn uniformly from
[0, 1] and $[0, \pi]$, respectively. The values of the negativity are
binned in 100 bins for entangled states. One additional bin is
reserved for separable states (of zero negativity). The latter point
is not shown in this graph for visual reasons: the probability of
separability is 49.7%. The total number of states drawn from the
prior distribution is $N_p \times N_\sigma = 600 \times 600$.

Next we consider the following family of states

$$\rho_k = 0.5\, |\psi_k\rangle\langle\psi_k| + 0.5\, \mathbb{1}/4, \tag{23}$$

with $k \leq 1$ and

$$|\psi_k\rangle = (|00\rangle + k|11\rangle)/\sqrt{1 + k^2}. \tag{24}$$

For k = 1 this state is in the two-parameter set, but for k < 1
it is not. Obviously, the smaller k < 1 is, the less well it is
approximated by a state $\rho_{p,\sigma}$. We investigate how well the
two-parameter model does by calculating

$$\Delta\Omega' = \Omega'_{p,\sigma} - \Omega'_a, \tag{25}$$

and tabulating the values for several values of k < 1 in Table
I. We moreover give the estimated negativities and purities,
plus their error bars, as compared to the actual values of those
quantities for the states $\rho_k$. What the table shows is that the
two-parameter model ceases to be more informative as k < 1
decreases. At that point, the estimate of negativity is still
perfectly fine, but the estimate of the purity starts to fail. In the
last entry, for k = 0.5, the two-parameter model's estimate
of purity is definitely off by a large amount.
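The actual negativity and purity of the family $\rho_k$ are easy to evaluate directly. The sketch below assumes the basis order |00>, |01>, |10>, |11> and takes the negativity as twice the absolute sum of the negative eigenvalues of the partial transpose (a normalisation assumed here, under which a Bell state has negativity 1). Function names are hypothetical.

```python
import numpy as np

def rho_k(k):
    """Equal mixture of the pure state (|00> + k|11>)/sqrt(1+k^2) and white noise."""
    psi = np.array([1.0, 0.0, 0.0, k]) / np.sqrt(1 + k**2)
    return 0.5 * np.outer(psi, psi) + 0.5 * np.eye(4) / 4

def negativity(rho):
    """Twice the absolute sum of negative eigenvalues of the partial transpose."""
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    ev = np.linalg.eigvalsh(pt)
    return 2 * np.abs(ev[ev < 0]).sum()

def purity(rho):
    return float(np.real(np.trace(rho @ rho)))

# Actual values for the k considered in the text
actual = {k: (negativity(rho_k(k)), purity(rho_k(k))) for k in (0.9, 0.8, 0.7, 0.6, 0.5)}
```

The purity is the same for every k because the pure component stays pure, whereas the negativity drops as k decreases. That is the feature the two-parameter model has to reproduce with its own parameters.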

In order to consider the three-parameter Bell-diagonal
model, we first display the prior distribution for negativity of
that model in Fig. 5. Next we discuss a state that is not
close to any state in the two-parameter set of states, but that is
still reasonably well described by the three-parameter model,

$$\rho_1 = 0.53\, |\psi_1\rangle\langle\psi_1| + 0.47\, |\phi_1\rangle\langle\phi_1|, \tag{26}$$



FIG. 2: The state considered here is of the form $\rho_{p,\sigma}$ with
p = 0.4 and $\sigma = 0.4$. Each of the five Bell correlation observables
is measured 400 times, so that in total $N_m = 2000$ measurements
have been performed. Plotted is the posterior probability
distribution for the negativity. The results have been binned together
in 50 bins of equal size for entangled states, plus one extra bin for
separable states (at zero negativity). The posterior probability for
entanglement is 98% in this case (with 2% falling in the first bin
at zero negativity). The estimated negativity and its error bar are
$\bar{N} = 0.082 \pm 0.039$, where the actual negativity of the state
$\rho_{0.4,0.4}$ is $N(\rho_{0.4,0.4}) = 0.0843$.
FIG. 4: Same as Figure 2 but for the separable state $\rho_{1/3,1/3}$. The
estimate obtained for the negativity is $\bar{N} = 0.012$, and its standard
deviation is $\sigma_N = 0.020$. The probability of entanglement is
$P_{\rm ent} = 40.5\%$. The bin at zero negativity contains 59.5%
probability, and that point is not plotted for visual reasons.



  k      N       $\bar{N} \pm \sigma_N$    $\bar{P} \pm \sigma_P$    $\Delta\Omega$    $\Delta\Omega'$
 0.9    0.247    0.246 ± 0.024             0.436 ± 0.012              7.2               36
 0.8    0.238    0.237 ± 0.024             0.431 ± 0.012              0.9               30
 0.7    0.220    0.219 ± 0.024             0.423 ± 0.012              -11               18
 0.6    0.191    0.190 ± 0.024             0.410 ± 0.011              -29               0.8
 0.5    0.150    0.149 ± 0.025             0.392 ± 0.011              -53               -24

TABLE I: Comparison, through the Akaike and Bayesian information
criteria [using Eq. (25)], of the two-parameter model based on the
family of states $\rho_{p,\sigma}$ [Eq. (4)] and the full 15-parameter
description of all two-qubit states, with measurement data generated
from the family of states $\rho_k$ [Eq. (23)]. Here the number of
measurements is $N_m = 5 \times 1000$. The purity of $\rho_k$ is equal to
P = 0.4375 for any value of $0 \leq k \leq 1$. For decreasing values of k,
$\Delta\Omega$ and $\Delta\Omega'$ decrease, indicating the two-parameter model
becomes less and less informative. The estimate of purity becomes,
likewise, less and less reliable.



FIG. 3: Same as Figure 2 but for the purity. The purity of the state
$\rho_{0.4,0.4}$ equals 0.3638; the estimate (plus error bar) obtained for
the purity is $\bar{P} = 0.364 \pm 0.016$. The estimate of the purity is,
relatively speaking, much more accurate than that of entanglement.



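with

$$\begin{aligned}
|\psi_1\rangle &= (|00\rangle + 0.9|11\rangle)/\sqrt{1.81}, \\
|\phi_1\rangle &= (|01\rangle + 0.9|10\rangle)/\sqrt{1.81}.
\end{aligned} \tag{27}$$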

We consider $N_m = 5 \times 1000$ measurements, with each
correlation being measured 1000 times. (For the calculations
with the three-parameter model, a test set of size $10^7$ was
used. In contrast, for the two-parameter model, test sets of
size 600 x 600 were sufficient in all cases. This illustrates
that choosing a good physical model with as few parameters
as possible pays large dividends.) For this state we calculate
the AIC and BIC and compare the two- and three-parameter
(Bell-diagonal) models to the full-state model,

$$\begin{aligned}
\Omega_{Bd} - \Omega_a &= 2.4, \\
\Delta\Omega = \Omega_{p,\sigma} - \Omega_a &= -462,
\end{aligned} \tag{28}$$

for the AIC, and

$$\begin{aligned}
\Omega'_{Bd} - \Omega'_a &= 27, \\
\Delta\Omega' = \Omega'_{p,\sigma} - \Omega'_a &= -433,
\end{aligned} \tag{29}$$

for the BIC. That is, the three-parameter model is considered
more informative than the model containing all physical
states. On the other hand, the two-parameter model is much
less informative. The estimates for negativity and purity are,
for the three-parameter model,

$$\begin{aligned}
\bar{N} &= 0.059 \pm 0.022, \\
\bar{P} &= 0.4977 \pm 0.0025,
\end{aligned} \tag{30}$$

where the actual values are

$$\begin{aligned}
N &= 0.059, \\
P &= 0.502.
\end{aligned} \tag{31}$$

Thus, both purity and negativity are estimated correctly within
the three-parameter model; and this is what one would expect
given the AIC and BIC criteria. The posterior probability
distribution for the negativity is plotted in Fig. 6.

FIG. 5: Prior probability distribution of the negativity, for the
three-parameter set of Bell-diagonal states. The point at zero
negativity is left out for visual reasons: separable states occupy
50.0% of the total volume. Here $10^7$ states were drawn from the
prior distribution over states.

FIG. 6: Posterior probability distribution of the negativity, using the
three-parameter set of Bell-diagonal states as prior, for data
generated from the state $\rho_1$ of Eq. (26).

For the two-parameter model, in contrast, we get

$$\begin{aligned}
\bar{N} &= 0.056 \pm 0.025, \\
\bar{P} &= 0.353 \pm 0.009,
\end{aligned} \tag{32}$$

so that again the purity estimated by the two-parameter model
is way off, although the estimated negativity is still quite good.
Thus, when the AIC and/or BIC criteria tell one not to trust a
certain model, it does not imply that all estimated quantities
from that model are, in fact, incorrect.

Lest one start to think that the two-parameter model in fact
somehow always estimates the negativity correctly, even if the
estimated purity is wrong, here is a counterexample to that
idea: when the $N_m = 5 \times 1000$ data are generated by the
mixture

$$\rho_2 = 0.53\, |\psi_2\rangle\langle\psi_2| + 0.47\, |\phi_2\rangle\langle\phi_2|, \tag{33}$$

with

$$\begin{aligned}
|\psi_2\rangle &= (|00\rangle + 0.5|11\rangle)/\sqrt{1.25}, \\
|\phi_2\rangle &= (|01\rangle + 0.5|10\rangle)/\sqrt{1.25},
\end{aligned} \tag{34}$$

whose negativity is N = 0.039, the two-parameter model
concludes the state is separable with high probability,
$P_{\rm ent} = 3.1\%$, and $\bar{N} = 3.5 \times 10^{-4} \pm 0.0025$ (and the
estimated purity is incorrect as well: $\bar{P} = 0.319 \pm 0.008$ instead
of the correct value P = 0.502). Here, $\Delta\Omega = -203$.

IV. DISCUSSION AND CONCLUSIONS



We have demonstrated a method to characterize entangle- 
ment sources from finite sets of data, using Bayesian updating 
for the probability distribution over density matrices. One ob- 
tains a posterior probability distribution for any quantity that 
can be efficiently calculated from an arbitrary density matrix. 
For instance, one obtains a probability that one's state is en- 
tangled, as well as expectation values of any computable en- 
tanglement monotone, including estimates of statistical errors. 
These values should be compared to their a priori values to 
judge whether one's measurement results lead one to be more 
certain about entanglement or less. 

For two qubits it is in principle sufficient for the purpose of 
detecting entanglement to measure spin on each qubit in just 
two orthogonal directions. On the other hand, empirically, 
we found that for accurately quantifying two-qubit entangle- 
ment, adding one more correlation measurement is very ben- 
eficial. Thus we concentrated on discussing measurements of 
five spin-spin correlation functions. 

It is hard to say in general what sort of measurements, short 
of fully tomographic measurements, will be sufficient for es- 
timating what sort of quantities. An easy check, though, is 
to count by how many parameters a given quantity is deter- 
mined. For instance, purity is determined by the eigenvalues 
of the density matrix. Thus for two qubits one needs only 
three parameters. Thus, reliably estimating the purity of one's 
output states ought to be easier than estimating entanglement. 
Our simulations confirm this suspicion, producing relatively 
smaller error bars for estimates of purity than for
entanglement.

It is important to note that in the above we used the phrase
"characterizing entanglement sources," rather than "verifying
entanglement," because the latter method, in its standard
interpretation, has a different meaning: in entanglement
verification one tries to find a proof of entanglement that
convinces a skeptical outsider. The Bayesian method, by contrast,
describes one's own belief. In particular, one's prior belief
about the entanglement-generating source is certainly
to be included in a Bayesian description, but in entanglement
verification methods such beliefs are not allowed.
Nevertheless, Bayesian methods can be used for the stricter purpose
of entanglement verification, as discussed in [22].

In order to characterize one's entanglement source, then, 
it is allowed to use a model describing one's source, based 
on, e.g., previous experiments and experiences with the same 
(or similar) device. We provided a criterion to judge whether 
a given model of one's source is more or less informative 
than other possible models. In particular, one can always 
parametrize the output states by using the full quantum- 
mechanical description of an arbitrary state of correct Hilbert- 
space dimension. The latter model, though, while being com- 
plete, may have more parameters than wished for or needed. 
Instead, one may be able to use a description of one's source 
in terms of a (small) number of physically relevant
parameters. We proposed to use two criteria to judge the relative
merits of such models, the Akaike Information Criterion (AIC),
and the Bayesian Information Criterion (BIC) [13]. We then
showed how the AIC and BIC can be used to choose a test set
of states, i.e., an a priori probability distribution over
quantum states generated by one's source: a Bayesian method, of
course, only produces probabilities of entanglement by first
choosing a prior.

If a simple model describes one's source very well, then
one's test set can be based on that model. We applied the
AIC and BIC criteria to several examples, all involving two
qubits, and showed that indeed, such criteria indicate whether
a model's predictions about purity and entanglement of the
output of the source (including a probability that one's output
state is entangled, as well as an estimate of the amount of
entanglement) can be expected to be reliable or not. We
demonstrated this by showing that certain estimates produced from
a simple model are wrong if the information criteria deem the
model to be less informative than the full 15-dimensional
description of two-qubit quantum states, whereas those estimates
are right on the mark when the criteria deem the simple model
to be more informative.